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Abstract 

This article proposes a family of link functions for the multinomial response 
model. The link family includes the multicategorical logistic link as one of its 
members. Conditions for the local orthogonality of the link and the regression 
parameters are given. It is shown that local orthogonality of the parameters in 
a neighbourhood makes the link family location and scale invariant. Confidence 
regions for jointly estimating the percentiles based on the parametric family of 
link functions are also determined. A numerical example based on a combina- 
tion drug study is used to illustrate the proposed parametric link family and 
the confidence regions for joint percentile estimation. 
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1 Introduction 



In this article we address two issues related to multinomial response models, (i) a 
family of link functions and (ii) percentile estimation under a parametric link family. 
In the first few sections we propose a family of link functions for multinomial nominal 
response models. When working with multinom ial data sets t he common practice 

pp. 267-274). How- 



is to fit the multicategory logis tic link function ( lAgresti 



2002 



ever, ICzado and Santnerl (119921 ) show that if the link function is incorrectly assumed 
then it leads to biased estimates thus increasing the mean squared error of predic- 
tion. Using a data set based on a combination drug therapy experiment we show 
that parameter estimation is improved by using the proposed link family instead of 
the commonly used multivariate logistic link. The parametric link family proposed 
includes the multivariate logistic link as one of its members. In the later part of the 
article we discuss three methods for finding confidence regions for the percentiles of 
a multinomial response model. The confidence regions determined are based on the 
estimated values of the link parameters. 

In univariate generalized linear models (GLMs), especially for binary data, family 
of link functions have been discussed by many researchers. Some of the one a n d two 



parameter link 



197fih 



(119881); 



Pregibon 



amilie s for binary models are. 



1980 



Guerrero and Johnson 



Czado and Santnerl (1199 



J; CzadJ ( 



propo s ed by namely 



982 



); 



1993 



Aranda-Ordazl (U9811) : 



1997J); 



'rent ice 



Langi (11999h . How 



1975 



Stukel 



ever, unlike binary regression models research papers on link families for multinomial 
responses are rarely found in t he lit eratu re. T h e two parametric link families pro- 



posed by 



Genter and Farewell! ( 119851 ) and 



Lang! (119991 ) are only applicable to multi- 



nomial data sets with ordered categories. Till date we were unable to find any work 
which addresses a family of link functions for multinomial data sets with nominal re- 
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sponses. The situation is similar for percentile estimation methods i n multinomial re 
sponse mode l s. Th o ugh a huge num b er of research papers (namely, 



Carter et al. 



(119861 ) 



Williams! dl986h 



Huang! ( 1200 lh 



Hamilton! (11979); 



Biedermann et al. 



(12006! ) 



Li and Wiens 



(120 111 )) have been written on percentile estimation and effect of link misspecification 
on percentile estimation for binary data, almost no work has been done in the case of 
multinomial data. There are, however, many experimental situations where multino- 
mial responses may be observed for each setting of a group of control variables. As 
a typical example we may consider a drug testing experiment, where both the effica- 
cious and toxic responses of the drug/s are measured on the subjects. This results in 
two responses, efficacy and toxicity of the drug, both of which are binary in nature. 
Since the responses come from the same subject they ar e assumed to be correlated 



and can be modeled using a multinomial distribution (IMukhopadhyay and Khuri 



20081 ). In this situation it may be of interest to the experimenter to jointly estimate 
the lOOp percentile of the efficacy and toxic responses. In this article we discuss a 
numerical example based on the pain relieving and toxic effects of two analgesic drugs 
and determine confidence regions for the lOOp percentiles of both the responses. 

While parametric link families are able to improve the maximum likelihood fit 
when compared to canonical links, any correlation between the link and the reg ression 



param eters leads to an increase in the variances of the parameter estimates 



Czado 



( 119971 )] . However, it can be shown that if the parameters are ort 



ipgonal to each othe r 



then the variance inflation reduces to zero for large sample sizes (I Cox and Reidl. 



1987) 



Czado! (I1997h 



Conditions for local orthogonality in a neighbourhood was proposed by 
for univariate GLMs. In this article we extend these conditions so that we can apply 
them to a multiresponse situation. It is also shown that the local orthogonality of 
the parameters imply location and scale invariance of the family of link functions. 
The remainder of the article is organized as follows: In Section [2] we describe 



the family of link functions for the multinomial model. Detailed conditions of local 
orthogonality between the link and the regression parameters are given in Section 
[3j In Section [4] we discuss three interval methods for percentile estimation in a 
multinomial model. The proposed link family and confidence regions are illustrated 
with a numerical example based on a drug testing experiment in Section Concluding 
remarks are given in Section [6j 



2 A family of link functions for multinomial data 



In this section the multinomial response model with a parametric link function is 
introduced. We use a scaled version of the multinomial distribution. The following 
three components are used to describe it: 

• Distributional component: A random sample of size n, yi, . . . ,y n , is selected 
from a multinomial distribution with parameters (7Tj, n,); 7Tj = (7^1, . . . , 7Tjq), % = 



n. Th e density function o 



distribution ( iFahrmeir and Tutz 



1L. 



2001 



Yi/rii also called the scaled multinomial 



p 76) is, 



s(fi\0i 



exp < ; LOi 



(2.1) 



where 0, 



Mizs^)...-.log( I= 5^)j , 6(0,) = -log(l-E5=i^), 

log (— u n ' 1 

of observations is N = Y17=i n «- 



C(y i; 0, Ui ) = log ( , il! „,^ !(n ^ 1 --^)! ) : 



1. The total number 



Linear predictor: A q dimensional linear predictor, tj(x) = Z(x)/3, where 
Z(x) = ©j =1 fj(x), fj(x) is a known vector function of x, f3 = [/3^, . . . , /3' q ]' 
is the p x 1 vector of unknown parameters with the jth component, /3p of 
length pj and p = Y^j=iPj- 



• Parametric link function: = 7r = h(a, rj), where h(a, ■) = [hi(a, ■), ■■■, h q (at, •)]', 
o; rxl = [ct[, . . . , ac' q ]', cx.j is of length r 3 - and J^^ =1 = r. 

2.1 Proposed form of parametric link function 



Several researchers (IStukel 



1988 



Czado 



19891 ) propose the following generalization 



for binary response models with a logistic link function 



M x ) = E{y\x) = h(at,rj) 



exp{G(cx,r])} 
[l + exp{G(a,77)}]' 



where G(at, •) is a generating fa mily wi t 



using the generating family by 



i the unknown link parameter a. For example 



Czadol ( 1198a ) we get 



G(a,r)) 



(i±!fcl if ^> 

(l-r,) Q 2-l . f „ 
- i - if 77 < 0, 



where ck = [a^a^]'. Usually when modeling th e mean in a multinomial response 



model the multivariate version of the logit model (jAgresti . 



2002 



pp. 267-274) is used, 



Ti j = hj(rj) 



1 + ELi ex p(^) 



, for j = 1, . . . , q. 



An alternative form of the above model is given by using the link function g where 
g = h \ 



Vj = 9j(t*) = log ; _ ^ 9 - 

1 Z^7 = l H'j 
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Analogous to the binary case we propose the following generalization of the multicat- 
egorical logit model, 

^ = Ma, n) = ^P{^(^)} for j = 1, ... , q, (2.2) 

where G = [Gi, G2, ■ ■ ■ , G q ]', Gj(acj,-) is a generating family for binary response 
models as described above, hj(ct, ■) is the j'th component of h(o;, ■) = g~ 1 (a, •). The 
family A = {h(o;, •) : a 6 17} includes the multivariate logistic link function if there 
exists an et 6 f2 such that G(a , •) is a identity function. 

2.2 Parameter estimation 

Summing up the previous sections we can write the multinomial model with a para- 
metric link function as, 

7r(x)=h[a,77(x)], (2.3) 

where tj(x) = [»7i(x), . . . , ry (x)]' = Z(x)/3, x G R k , (3 is a p x 1 vector of unknown 
parameters and ck is r x 1 vector of unknown link parameters. Also, 

r\j = Tij{x) = fj{x)Pj = gj[a, tt(x)], for j = 1, . . . , q, (2.4) 

where g = [gi, . . . , g q }' is the inverse of h = [hi, ... , h q ]'. We use the notation <5 to 
denote the joint vector of the regression and link parameters, thus 8 = [f3 f , a']' is a 
vector of length (p + r). The parameter vector 8 is estimated using the maximum 
likelihood estimation (MLE) method. A brief description of the procedure is given as 
follows: 

Using the scaled version of the multinomial distribution as described in equation 
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(12. lp the log-likelihood function for the sample yi, . . . , y n is, 



1(8) = X>W 

i=l 
n 

= y^[yffii ~ b(0j)]ni + constant. 



i=l 



Thus the score function is (jFahrmeir and Tutzl . 1200 ll . p 436) 



(2.5) 



dl(S) 
88 
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i=l 



n P) 



i=i 



and (jFahrmeir and Tuta . 1200 ll . p 436) 



(2.6) 



8 2 l(8) 
dSdd' 



n dn d/j, n 9 8 2 9- 



88 

1=1 

H„, (say). 



i=l 3=1 



8888' 



(2.7) 



From equation (\2.7\i we get the Fisher information matrix to be 



—E 



8 2 l(8) 
8888' 



n Pi 

£f^(y 4 )] 



i=i 



-i <Hh 
88'' 



(2.8) 



For maximizing the log-likelihood the Fisher scoring iteration method is used 
which yields, 

<5 ( m+ i) =<5 M + j ?; i^p 5 (2 . 9) 
where m ind icates the mth iteration. It is also possible to use the method given in 



Stukell (119881 ) for finding an approximate MLE of 8. In this method, the MLE of (3 is 



obtained by fixing a and denoted by (3(a). An approximate MLE of 8 is then given 
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by 6 = \(3 (6c), at'}' which maximizes the log-likelihood function over a set a. 



2.3 Asymptotic distribution of d 

Suppose 6=1/3, 6c']' denotes the MLE of 6. Using equation f !2.6p and the central 
limit theorem we know ^ij^ asymptotically follows a normal distribution with mean 
and variance J n . By first order Taylor series expansion, 







dl(6) 81(6) 



86 



86 



+ 



d 2 i{6) 

8686' 



(6-6). 



This implies ( IFahrmeir and Tutzl . l200ll . p 439) 



N(6-6) 



nh; 



-i dl(6) 
86 



Thus, we get that 6 has an asymptotic normal distribution with mean 6 and variance 

3 Orthogonalization of link and regression param- 
eter vectors 



In this section we discuss certain conditions for which the link parameters are approx- 
imately orthogonal to the regression parameters in a neighbourhood asymptotically. 
In our numerical examples we show that approximate orthogonality of the parame- 
ters reduces the variance inflation of f3 while increasing the numerical stability of the 
computations. The family of link functions for which the regression parameters are 
approximately orthogon al to the link param eters in a neighbourhood are also loca- 
tion and scale invariant. iLi and Duanl (119891 ) noted the importance of a family of link 



function being location and scale invariant. In their paper they observed that for a 
unspecified link function the intercept parameter was not identified while the slope 
parameter was identified only up to a multiplicative constant. Thus any variation in 
the location and scale was absorbed by the link function. 



Proposition 1: The regression parameter vector f3 and link parameter vector 
oc are approximately orthogonal in a neighbourhood around rj , if the family of link 
functions h(o;, ■) satisfies the following conditions, (i) there exists ry and ir such 
that 

h(a,?7 ) = 7T , VaGO, (3.1) 



and (ii) there exists a s such that 

dh(ct, 77) 



drj 



I T)=V0 



s , V a E tt. (3.2) 



Proof: By first order Taylor series expansion of h(a, rj) around rj and equations 
(EI]) and (E2D, 

h(a,r/) w h(a,77 ) + — — — | r7=T7o (T7 — r7 ) (3.3) 
= tto + So^-^o)- ( 3 - 4 ) 



Equation (13.41) shows that the family of link functions h(a, •) is approximately in- 
dependent of a in a neighbourhood of rj where approximation (13. 3ft holds asymp- 
totically. Hence, if the conditions (13. ip and (13. 2p are satisfied, then the regression 
parameters (3 and the link parameters a are approximately orthogonal in a neigh- 
bourhood of 77 asymptotically. 



9 



Extending the definition of a location and scale invariant family given by ICzado 



(119971 ) to the multiple response case we state: a family A is said to be location and 
scale invariant if for every he A, the function h*(a, rf) = h(a, a + hrj) ^ A for all 
a and b ^ Of, or I&, where a — |Ol, . . . , CLq }', h = diag(bi, . . . , b q ), a is a matrix 
of the same order as a with all elements zero, Of, is a matrix of same order as b with 
all elements zero and I& is an identity matrix with the same order as b. 

Proposition 2: If every member of a family A = {h(o:, ■) : a £ ft} satisfies 
conditions (13.11) and (I3.2p for fixed itq and So, then A is location and scale invariant. 

Proof: Suppose every member h of the family A satisfies conditions (13 .ip and 
(13. 2p for fixed 7r and s . Define h*(a, rf) = h(a, a + hrf), then at rj* = b _1 (?7 — a), 



h*(a, rf) = h(a, rj ) = tt V a. £ fl, 

where 0„ and b ^ b or I 6 . Thus, equation (13.11) is satisfied by h* at rj = rf* . 
Also, 



dh*(a,ri) <9h(a, a + b^) dh(a,ri) 

\v=v* = |r7=r7* = b — \ v=rt0 = bs t s o, 

for b 7^ If,, implying h* ^ A for a ^ a and b ^ Of, or If,. Hence the family A is 
location and scale invariant. 



3.1 Construction of (7r , s )-standardized link families at r] Q 

y A sa tisfying conditions ( 13. ip and ( 13. 2 p is called (7r , s )-standardized at rj Q 



A fami 


y A sa 


(Czado. 


1997) 
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Proposition 3: Suppose G(a,r]) = [Gi(oci, rji), . . . , G q (oc q , r} q )]' where ol = 
[ai, ■ ■ ■ , ot q ]' and r] = [rji, . . . ,r] q ]', such that each {Gj(ctj,r}j), ocj G flj} is a gen- 
erating family for binary response models and are (fioj, s 0j )-standardized at T] j for 
j = 1,2, ...,q. Then the family A g = {G(a, •) : ct G fl = Hi x . . . x Q q } 
is (tt , s )-standardized at r} , where rf = [r] 01 , . . . , rj 0q ]', tt = [fJ, i, • • • , {J>o q ]', and 
s = diag{s 01 , . . . , s 0q }. 

Proof: Since, Gj(oij,r]j) is (/i j, s j)- standardized at rj j, Gj(ocj, r) j) = /x j, V otj G 
ttj, and 9G3 ^' r,) \ ri=mj = s 0j , V a, G Thus, G(a,r7 ) = [^ i, ^02, • • • , Hoq]' = 
7r , V a G f2, and ^^'^ = diag{s 01 , s 02 , s 0q } = s , V a G 17. Hence, the 
family A g is (ttq, s )-standardized at r] . 

For using a (iz , s )-standardized generating family at T) Q three parameters, 7r , s 
and T7 , need to be estimated. Avoiding estimation of extra parameters, the generating 
family can be standardized by choosing 7r = /3 n , r)n = @n and s = I. This selection 



19971 ). when centered covariates 



allows for a meaningful interpretation of 7r (jCzadol . 
are used. The (7r = /3 , s = I)-standardized at f] = (3 = [/3io • • • , /3 9 o]' generating 
family is denoted by G c (c>!, 77), where the jth component of G c is, 

G cj (ctj,r]j) = (3 j0 + G(ocj,r] cj ), r] cj = r]j - (3 j0 - 

Here, G(atj, •) is a (/io = 0, Sq = l)-standardized at 7/0 = generating family for 
binary response models and (3j is the intercept parameter for the jth response. 

Using condition (12. 2ft . the family of link functions, h(a,7?), for the multinomial 
response model is (7r , s )-standardized at r] when 7r j = i + ^ P ^xp(/3o) ' an( ^ s o = 
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c is a constant matrix with its (j, k)th element equal to, 



Cjk 



7i 0j (l-7i 0j ) if j = k 



As an example if we are using the generating family as suggested by lCzadol (119891 ) for 
binary response models as our Gj, the generating family for multinomial responses is 
then (7r = f3 Q , s = I)-standardized at r] = f3 



G cj(<*j,Vj) = (3 j0 + 



i 



" i 1 



•>Jl 

a j2 



if Vcj >0 

if i] cj < 0, 



(3.5) 



where i] cj = rjj - p j0 , for j = 1, . . . , q. 

In our numerical example (given in Section |5]) we observe that the variance in- 
flation ratios are reduced when a (n = (3 ,s = I)-standardized generating family 
at 77 = (3 is used instead of using a generating family which is (7r = 0, s = I)- 
standardized at r) = 0. Also, we observe that the Newton-Raphson iteration method 
does not converge when the (7r = 0, s = I) -standardized generating family at r) = 



Stukell ( 119881 ) has to be implemented. For 



is selected, and the grid selection method of l 
the generating family (ir = /3 , s = I)-standardized at rj = j3 Q the Newton-Raphson 
algorithm however converges. The estimation of unknown parameters requires less 
computational time when the Newton-Raphson method converges instead of using 
the grid searching method. Hence, we are able to show in our example that the nu- 
merical stability is increased and the computational time is reduced when using a 
(71-0 = /3 , s = I)-standardized at r] = j3 generating family . 
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4 Interval estimation of the percentiles 



Suppose we define S^^) as the settings of the control variables at which 7r = 
h[a,i7(x)], 

S K0 (8) = {x g R k \ 7Tq = h[a,i7(x)]}. (4.1) 

Then, S 7ro (<5) can be called the 7r th percentile of the multinomial distribution. In 
this section we propose three different methods for determining confidence regions for 

S 7Tq(<5)- 

4.1 Method 1: an asymptotic conservative confidence region 
based on ML estimates 

From Section Owe know that y/N(S-S) ~ MVN(0, S(<5) = iVJ" 1 ) asymptotically. 
This implies, yN(j3j — (3j) follows an asymptotic multivariate normal distribution 
with mean p . and variance Ej, here is a sub matrix of S(<5) corresponding to 
Using the normality of y/N (P j - (3j) we get, N{fi j - PjY'S^iPj - f3j) ~ x% asymp- 
totically. Thus, Pr[N0j - PjY^iPj - < 4, (1 _ r) ] = (1 - r), where X%, {1 . t) 
is the (1 — r)th quantile of a Xp distribution. Using Cauchy Schwarz inequality, 

sup — f i W? \ — - sup ^ 

= Nfa-PjYVffa-Pj), 

(4.2) 



thus, 



iV^x)^.-/^ 
f,(x)S J fj(x) 



< N(Pj - PjYEj^Pj - Pj) for all xeR k 



(4.3) 
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Suppose we define two events Aj and Bj as, Aj = 



fj .(x)E,.f'(x) ^ ^Pj,(l-r)' V X fc ^ 

. From equation (I4.3p . we know that 



Bj C Aj, thus, 



P[^(x) G C,(x), Vxe^]>(l-r), for all j = 1, 2, . . . , q, 



where C,(x) = {£ G P : Ly(x) < £ < Uj(x)}, 



L,(x) 



f,-(x)^ 



JV-if i (x)E^(x)4 >(1 _ T)> 



(4.4) 



Then, using Boole's inequality, 



Prfe(x) G C^x), V x G P fc , for all j = 1, 2, . . . , q] > (1 - qr), 



which implies, 



Pr[r/(x) G C(x), V x G P fe ] > (1 - gr), 



(4.5) 



where C(x) = x' =1 Cj(x). If we now denote P Lj (x) = ming 6 e( x ) hj(oc, £) and 
Pf^-f x) = max£ eC ( x ) hj(oc, £), for j = 1,2, ... ,q, then using the result given in ( iRaol . 



1973 



p 240), 



Pr[PLj(x) < ^-(a, f]) < %(x),Vx G P fe and V j = 1, . . . , q] > (1 - qr), 
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This implies that, 

Pr[P L (x) < h{a, rj(x)} < P,(x),Vx G R k ] > (1 - gr), (4.6) 

where Pl and Pj/ are g dimensional vectors with their jth elements equal to Plj and 
Pu,ji respectively. Since the link parameter cx is unknown, we use a (MLE of a) for 
computing P L j and Pu,j, f° r 3 — lj 2, . . . , g. 

Estimating S,r () (<5) by S,r (<5), we get a approximate 100(1 — r')% (r' = gr) con- 
servative confidence region for S^ ((5) as 

{CG J R fe :P L (x) < h[a, 77(C)] <Pc/(x) for all xeS„ (i)}. (4.7) 



4.2 Method 2: confidence region using the likelihood ratio 
test 

We derive the confidence region of S,r (<5) using the likelihood ratio (LR) test corre- 
sponding to the hypotheses, H : x G S,r (<5) versus i?i : x ^ S,r (<5). Under the null 
hypothesis we have, 



t/(x) = g(a,7r ) 



which implies 



;(a,7To) for j = 1,2, . . . , g. 



Suppose -D(x) is the deviance (IFahrmeir and Tutz 



2001 



p 108) under the null hy- 



pothesis while -D(x) is the deviance of the fitted model. Then, the LR statistic 



15 



L(x) = -D(x) — -D(x) has an asymptotic \ 2 distribution with q degrees of freedom. 
The 100(1 — r)% confidence region for S 7ro (<5) using the LR statistic is given by 



{x G R k : L(x) < xU- T }- 



(4.8) 



4.3 Method 3: confidence region using the score test 



Suppose, /3 = [/3io, . . . , (3 g0 }' and u 



dl 



dp 



a\8n 



. Let So be the estimated variance of 



u at 8 = Sq, where 1(8) is the log-likelihood function and Sq is the MLE of 8 under 



Hq in Section 14.21 



degrees of freedom ( iFahrmeir and Tutz 



hen s(x) = UnS n 



u n ha s an asymptotic \ 2 distribution with q 



2001 



p48). 



Using the score test, the 100(1 — r)% confidence region for S 7ro (<5) is given by 



{x G R k : S (x) < xli-rl 



(4.9) 



5 Example 



We c onsider a data set based on a combination drug experiment reported by (IGennings et al 



1994 



pp. 429-451). The main goal of the experiment is to study and model the rela- 
tionship between the dose levels of two drugs, morphine sulfate and A 9 -tetrahydro- 
cannabinol (A 9 -THC), on the pain relief and toxic responses of male mice. Eighteen 
groups of male mice (six animals per group) were randomly assigned to receive the 
treatment combinations and three responses were recorded. They were, E: the num- 
ber of mice in each group who exhibit only pain relief (no toxic effect), T: the number 
of mice in each group experiencing toxic effects (irrespective of pain relief), and W: 
the number of mice who experienced neither pain relief nor any toxic effects. So we 



may consider E and T as the efficacy and toxicity responses of the two analgesic 
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drugs. The dose levels of the two drugs formed a 3 x 6 factorial design, where a 
treatment combination consisted of a single injection using one of three levels of mor- 
phine sulfate (2, 4, 6 mg/kg) in addition to one of 6 levels of A 9 -THC (0.5, 1.0, 2.5, 
5.0, 10.0, 15.0 mg/kg). The centered dose levels of the two drugs morphine sulfate 
and A 9 -THC were denoted as x\ and x-^. The 3x6 factorial design D and the three 
responses are given in Table [TJ Since the responses were obtained from the same 
mouse they may be correlated. The binary nature of the responses allowed us to 
model them using a two category multinomial model. The response W was taken 
correlated binary to be the dummy category. Fo r more details on modeling bina ry 



responses using the multinomial distribution see ( IMukhopadhyay and Khuril . 



20081 ) 



Table 1: Design D and responses = [E^, T i; Wj\' . There are rii = 6 experimental 
units for each run. 

D Responses 





X2 


Ei 


Ti 


Wi 


-1.0 


-0.713 


5 





1 


-1.0 


-0.646 


2 





4 


-1.0 


-0.433 


2 





4 


-1.0 


-0.093 


4 


1 


1 


-1.0 


0.597 


5 


1 





-1.0 


1.287 


3 


3 





0.0 


-0.713 


5 





1 


0.0 


-0.643 


6 








0.0 


-0.433 


5 


1 





0.0 


-0.093 


3 


3 





0.0 


0.597 


3 


3 





0.0 


1.287 


3 


3 





1.0 


-0.713 


6 








1.0 


-0.643 


6 








1.0 


-0.433 


6 








1.0 


-0.093 


6 








1.0 


0.597 


1 


5 





1.0 


1.287 





6 
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5.1 Fitting a generalized multinomial model 

We start by fitting a multinomial regression model with the multicategorical logit link 
function to the data. The model is given by 



?7 E (x) = 1O + 0uXx + P12X2, 

77t(x) = 020 + fclXl + 02222- (5.1) 



The maximum likelihood estimates (MLEs) of (3 and their standard errors are re- 
ported in Table [2l The scaled deviance for the above fitted model is 29.6048 with 12 
degrees of freedom (p-value=0.0032). Since the p- value is 0.0032, the results show 
evidence of lack of fit. We thus consider the proposed parametric link functions for 
the multinomial models and see if it is possible to improve the fit. We use two choices 
for T7 , the fixed choice of rj Q = and later rj Q = f3 . 

Table 2: Maximum likelihood estimates and standard errors of the parameters in 
model (15. ip . 



Parameter 


Estimate 


Std. error 


p- value 


0io 


4.5275 


1.2203 


0.0002 


0n 


2.9644 


1.0566 


0.0050 


012 


2.5160 


1.2987 


0.0527 


020 


2.8197 


1.2582 


0.0250 


/?21 


3.6704 


1.1231 


0.0011 


022 


4.7535 


1.3761 


0.0006 


Deviance=29.6048. 



The multinomial model with a parametric link function considering rj to be fixed 
at is given by 



= exp^.^rfa)} ^ iQIJ = E T 

1 + Ez=i ex p{ G «(«i>%)} 
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where f ICzadol . Il989f ) 



if rjij > 



(5.3) 



if 77^ < 0, 



where a,- = [an, a,^]' for j = i?,T. The above link function becomes equivalent to 



the mufticategorical logistic 



by (IFahrmeir and Tutz 



2001 



ink function when a = [1, 1, 1, 1]'. Using the score test 
, p 48), we test the hypotheses, 



Hq : ctjk = 1 versus H 1 : ^ 1 V j = 1, 2 and k = 1, 2. 

From results of the score tests we observe that the null hypotheses are rejected for an 
and ai2- This implies that there is a need to modify both tails of the first response. 
Stepwise selection of each link parameter based on the akaike information criterion 
(AIC) were considered in the score tests. For comput i ng th e MLE of the parameters, 



Stukell (Il988f ) since the Fisher scoring 



8 = (f3', ct')' we use the method detailed in 
iterative method does not converge for r) = 0. The parameter estimates, standard 
errors and variance inflation ratios are given in Table EJ The computations are done 
by once considering a fixed in the information matrix and later estimating it from 
the data set. 

We also consider the parametric link function with rj Q = (3 (refer to equation 
(13. 5p ). Using score tests we again note that the link parameters an and ayi need to 
be included in the model. The Fisher scoring iteration method for obtaining MLE of 
8 converges and the results are given in Table |3j 

From Table [3] we note that the deviance using parametric link function for a gen- 
erating family standardized at r) = is 23.8866 at 10 degrees of freedom, and for a 
generating family standardized at rj = f3 is 22.9148 at 10 degrees of freedom. Since, 
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we have estimated two extra parameters for using the parametric link function, so 
the difference between the deviances using logistic link fu nction and parametr i c link 



2001 



function is a \ 2 distribution with 2 degrees of freedom (IFahrmeir and Tutz 
p 49). The differences between the deviances using a multivariate logistic link func- 
tion and parametric link function with generating families (15. 3p and (13 .5p are 5.7182 
(p- value 0.0573) and 6.69 (p- value 0.0353), respectively. This shows that using the 
parametric family of link functions with generating family (13. 5ft . we are able to sig- 
nificantly improve the fit over the multicategory logistic link function. Also in Table 
|3l we report the estimates of parameters and two estimated standard errors for each 
regression parameter for both the generating family. The first one assumes that the 
link parameters are fixed at their estimated values and second one assumes that the 
link parameters are estimated from the data set. The variance inflation ratio is the 
ratio of the standard error when a is estimated to the standard error when a is 
fixed. From Table [3] we note that the variance inflation ratios corresponding to the 
parametric link function with rj = are higher than those corresponding to r] = f3 . 
This implies that we achieve greater numerical stability when using a parametric link 
function with rj Q = f3 . 



5.2 Percentile estimation 

In this section we apply the three methods of interval estimation and find confidence 
regions for tvq = [0.75, 0.2]'. Thus, we are interested in jointly estimating the -E-D75 
and LD20 percentiles (where ED=Effective Dose and LD=Lethal Dose). For com- 
puting confidence intervals, we use MLE of 8 for generating family (tz = f3 , s = I) 
standardized at r) = f3 as it provides better numerical stability and also smaller 
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Table 3: Maximum likelihood estimates, standard errors and variance inflation ratios. 



Parameter 


r/ = 




r] = f3 






Estimates 


Variance 


Estimates 


Variance 




(S.E.) 


inflation 


(S.E.) 


inflation 


010 


17.4866 




6.3931 




a fixed 


(6.8881) 




(1.4117) 




a estimated 


(13.6982) 


1.9887 


(1.6786) 


1.1891 


0n 


14.1469 




17.3919 




a fixed 


(6.0235) 




( 8.4866) 




a estimated 


(11.9014) 


1.9758 


(9.6415) 


1.1361 




21.5053 




15.0282 




a fixed 


(9.8690) 




(7.0234) 




a estimated 


(20.2745) 


2.0544 


(8.0906) 


1.1520 


020 


12.3022 




5.1825 




a fixed 


(4.9916) 




(1.5889) 




ex estimated 


(16.7199) 


3.3496 


(1.9165) 


1.2062 


021 


10.8918 




7.2412 




a fixed 


(3.9276) 




(2.2177) 




a estimated 


(14.6531) 


3.7308 


(2.8268) 


1.2747 


022 


17.7634 




7.8120 




a fixed 


(6.5569) 




(1.8852) 




a estimated 


(24.0555) 


3.6687 


(2.7628) 


1.4656 


"ii 


0.9 




0.57 




a estimated 


( 0.2218) 




(0.1827) 




«12 


-2.9 




0.35 




a estimated 


(4.0716) 




(0.1519) 






Deviance = 23.886 


6 


Deviance = 22.9148 





variance inflation ratios. The estimated 7r th percentile is given by 

S no (S) = {xe J R 2 :7r = h(a,f/(x))} 

= {[-0.6715,0.1365]'} = {x } , say, (5.4) 



which is a singleton set. In our example we have two categories and three regression 
parameters in each category, thus q = 2 and pj = 3. For obtaining 95% confidence 
regions for S,r (<5), we choose r' = qr = 0.05, which gives r = 0.025 and ^_ T ^ = 
9.35, j = E,T. 
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Since S,r (5) = {x }, the approximate 95% conservative confidence region for 
S 7ro (^) using method f (Section |4.ip is given by 



{x=[x 1 ,a; 2 ] / ei2 2 :P L (xo) < h[a, r)(x)] < P C7 (x )}. 



(5.5) 



For computing the above region we first need to find the intervals for i]e and t)t- 
Using equation we have the intervals [-4.6794,-1.7894] and [1.0720,1.7006] 

for rjE and rjr, respectively. For calculating the intervals we use N = 108 (to- 
tal number of observations), /3- and Ej from Tables [3] and H] respectively. To get 
C(xo) we take the cartesian products of the intervals of t]e and t]t- The next step 
is to compute P L (x ) = [P L> i(x ), P ii2 (x )]' and P[/(x ) = [P^^xq), P { / i2 (x )] / where 
Plj(x ) = min^ eC(xo) hj(a,g) and P[/j(x ) = max£ eC(xo) /^(a, £), for j = P,T. 



For computing the minimum and maxim um of the function h 



A, £) over C(x ) 



we use a MATLAB program called MCS (jHuyer and Neumaierl . I1999T 1 and we get 
P L (x ) = [0.6319,0.1182]' and P{/(x ) = [0.8414,0.3113]'. Hence, the 95% conserva- 
tive confidence region for S^^) by method 1 is given by 



{x = [x lt x 2 ]' e R 2 : [0.6319, 0.1182]' < h[a, t)(x)] < [0.8414, 0.3113]'} (5.6) 



Table 4: Estimated variance of 6, when model ( 15. ip is fitted using a parametric family 
of link functions with generating family ( 13.51) . 





010 


0ii 


012 


020 


021 


022 


an 


«12 


fro 


2.8176 


7.5455 


6.5334 


3.0072 


3.3696 


2.9374 


0.1117 


0.1346 




7.5455 


92.9582 


73.5399 


9.5858 


18.6817 


13.7987 


-0.3194 


-0.6891 


/3l2 


6.5334 


73.5399 


65.4584 


9.0370 


16.7079 


15.5723 


-0.0182 


-0.5365 


/3 20 


3.0072 


9.5858 


9.0370 


3.6728 


4.5693 


4.1738 


0.1884 


0.1163 


021 


3.3696 


18.6817 


16.7079 


4.5693 


7.9907 


6.9814 


0.2904 


0.0229 


/3 22 


2.9374 


13.7987 


15.5723 


4.1738 


6.9814 


7.6333 


0.3409 


0.0393 


«n 


0.1117 


-0.3194 


-0.0182 


0.1884 


0.2904 


0.3409 


0.0334 


0.0138 


«12 


0.1346 


-0.6891 


-0.5365 


0.1163 


0.0229 


0.0393 


0.0138 


0.0231 



22 



1.5 



Confidence regions 




-1 ' 1 1 1 1 1 1 1 1 1 1 

-1 -0.8 -0.6 -0.4 -0.2 0.2 0.4 0.6 0.8 1 

X l 



Figure 1: Confidence regions for the 7r th percentile using the three interval estimation 
methods. 

From equation (14. 8p . the 95% confidence region of S 7ro (<5) using the LR test (Sec- 
tion 14. 2p is given by 

{x = [xi, x 2 ]' G R 2 : L(x) < 5.99}, (5.7) 

since for r = 0.05, xii-r = 5.99. The 95% confidence region of S^^) using the 
score test (see equation (j4.9|) ) is, 

{x = [xi,x 2 ]' G R 2 : s(x) < 5.99}, (5.8) 
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where s(x) is defined as in Section H~3l 

The confidence regions for the percentiles using the three methods are graphically 
shown in Figure [TJ For plotting the confidence regions we choose 21 values of x\ from 
the interval [—1,1] at steps of 0.1. For each chosen x±, 1000 values of x 2 are chosen 
randomly from [—0.7132,1.2868]. Let S Xl be the set of the selected x 2 values. To 
determine the confidence region by method 1, for each value of X\ we compute 

L X2 ( x i) — rnin{x 2 G S Xl : [xi,£ 2 ]' is in region (15. 6p } 
and 

U X2 (xi) = max{x2 G S X1 : [xi,x 2 ]' is in region (15.61) } (5.9) 

Then, by plotting L X2 (x-\) and U X2 (xi) against x± we get the lower and upper bounds 
of the region, respectively for method 1. We use the same methodology as above to 
plot the confidence regions by the LR and the score test. From Figured], we observe 
that the confidence regions found by using LR test is narrower than the other two 
confidence regions, while the score test gives the widest region. 

6 Concluding Remarks 

In this article, we have introduced a family of link functions which is location and scale 
invariant and provides local orthogonality between regression and link parameters for 
multinomial response models. Using a numerical example we are able to show that 
parametric link function provides a better fit over multivariate logistic link function. 
We also discussed three different methods for constructing 100(1 — r)% confidence 
regions for the 7rth percentile. 

The percentile estimation methods for multinomial models discussed in this article 
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can be used in clinical trials which are involved in determining dose le vels having de 



sired probabilities of both toxicity and efficacy, namely Phase I /II trials (IGooley et al. 



19941 ; 



Thall and Russell 



1998 



O'Quiglev et al. 



20011 ) . By applying the above inter- 



val estimation methods experimenters will be able to find confidence regions of dose 
levels with tolerable toxicity and the desired efficacy. 

There has been a recent rise of interest among researchers t o find designs for 



l ogisti c reg ression models which are robust to link misspecification. 



Biedermann et al 



( 120061 ) and 



Woods et al. 



pl ausible link functions w hile 
of 



(l2006f) propose robus t desi 



gns by considering a finite set of 



Adewale and Xul (120101 ) uses the family of link functions 



Aranda-Ordazl (1198 if ) in their approach. In the future we plan to use the family 
of link functions proposed in this article to determine designs for multinomial models 
which are robust to an incorrectly assumed link function. 
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